Efficient top-k processing in large-scaled distributed environments
Abstract
The rapid development of networking technologies has made it possible to construct a distributed database that involves a huge number of sites. Query processing in such a large-scale system poses serious challenges beyond the scope of traditional distributed algorithms. In this paper, we propose a new algorithm, BRANCA, for performing top-k retrieval in these environments. Integrating two orthogonal methodologies, "semantic caching" and "routing indexes", BRANCA is able to solve a query by accessing only a small number of servers. Our algorithmic findings are accompanied by a solid theoretical analysis, which rigorously proves the effectiveness of BRANCA. Extensive experiments verify that our technique significantly outperforms the existing methods. © 2007 Elsevier B.V. All rights reserved.
Related articles
Efficient Top-k Query Processing Algorithms in Highly Distributed Environments
Efficient top-k query processing in highly distributed environments is a valuable but challenging research topic. This paper focuses on the problem over vertically partitioned data and aims to propose more efficient algorithms. The effort is put on limiting the data transferred and the communication round trips among nodes to reduce the communication cost of query processing. Two novel algorit...
Top-k aggregation queries in large-scale distributed systems
Distributed top-k query processing has become an essential functionality in a large number of emerging application classes like Internet traffic monitoring and Peer-to-Peer Web search. This work addresses efficient algorithms for distributed top-k queries in wide-area networks where the index lists for the attribute values (or text terms) of a query are distributed across a number of data peers.
E2DR: Energy Efficient Data Replication in Data Grid
Data grids are an important branch of grid computing which provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems is traditionally focused on performance improvements driven by the demand of client's applications in scientific and business domai...
Processing Top-k Queries in Distributed Hash Tables
Distributed Hash Tables (DHTs) provide a scalable solution for data sharing in large scale distributed systems, e.g. P2P systems. However, they only provide good support for exact-match queries, and it is hard to support complex queries such as top-k queries. In this paper, we propose a family of algorithms which deal with efficient processing of top-k queries in DHTs. We evaluated the performa...
Efficient Processing of Preference Queries in Distributed and Spatial Databases
Traditional SQL queries are recognized for producing an exact and complete result set. However, for an increasing number of applications that manage massive amounts of data, the large result set produced by traditional SQL queries has become difficult to handle. Therefore, there is an increasing interest in queries that produce a more concise result set. Preference queries capture the wishes of...
Journal: Data Knowl. Eng.
Volume: 63
Publication date: 2007